## Introduction

This repository provides the code for our proposed AttnPath model.

## Code Structure

`all_process_no_pretraining_dynamic_attention_lstm_reward_shaping_select_path_three_regularization_methods.py` is the main program. It first trains the model and then evaluates it on two tasks: fact prediction and link prediction.

`utils.py` provides helper functions, including teacher forcing and removing rings (cycles) from navigation paths.
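As an illustration of what removing rings from a path can look like, here is a minimal sketch. It is not the code in `utils.py`: the function name `remove_cycles` is hypothetical, and the sketch assumes a path is a plain sequence of entity names, whereas the actual paths may also carry relations.

```python
def remove_cycles(path):
    """Return `path` with every loop cut out, keeping the first visit of each node.

    Hypothetical helper: assumes `path` is a sequence of hashable entity names.
    """
    cleaned = []
    position = {}  # node -> index of that node in `cleaned`
    for node in path:
        if node in position:
            # The agent came back to a visited node: drop the detour in between.
            cut = position[node] + 1
            for dropped in cleaned[cut:]:
                del position[dropped]
            del cleaned[cut:]
        else:
            position[node] = len(cleaned)
            cleaned.append(node)
    return cleaned
```

For example, the path `a, b, c, b, d` contains the ring `b, c, b`, so the sketch keeps only `a, b, d`.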

`networks.py` contains all the models used in our experiments.

`env.py` defines the environment (i.e., the knowledge graph) in which the agent navigates. It also contains the code for initialization, selecting an action, and walking forward.
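The following toy environment is a rough sketch of the idea, not the contents of `env.py`. The class name `ToyKGEnv`, its method signature, and the reward of 0 for a valid step are assumptions made for illustration; only the invalid-action reward of -0.05 comes from the `-wr` default listed under Usage.

```python
import random


class ToyKGEnv:
    """Hypothetical knowledge-graph environment: actions are relations."""

    def __init__(self, triples, invalid_reward=-0.05):
        # Index the graph as head -> relation -> list of tails.
        self.edges = {}
        for head, relation, tail in triples:
            self.edges.setdefault(head, {}).setdefault(relation, []).append(tail)
        self.invalid_reward = invalid_reward

    def step(self, current, target, relation, rng=random):
        """Try to follow `relation` from `current`.

        Returns (reward, next_entity, done). An invalid action leaves the
        agent where it is and yields the invalid-action reward.
        """
        candidates = self.edges.get(current, {}).get(relation)
        if not candidates:
            return self.invalid_reward, current, False
        next_entity = rng.choice(candidates)
        return 0.0, next_entity, next_entity == target
```

With the single triple `("a", "r", "b")`, following `r` from `a` moves the agent to `b`, while any other relation is an invalid action and the agent stays at `a`.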

In the BFS folder, `KB.py` defines the data structures used to represent the knowledge graph.

`BFS.py` implements breadth-first search, which is used to judge whether a reasoning path is valid for a triple.
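One plausible reading of this validity check is sketched below: starting from the head entity, expand the set of reachable entities level by level along the relations of the path, and accept the path if the tail is reached. This is an assumption about the idea, not the code in `BFS.py`; the function name `path_reaches_tail` and the adjacency-dictionary graph format are hypothetical.

```python
def path_reaches_tail(graph, head, relation_path, tail):
    """Check breadth-first whether `relation_path` leads from `head` to `tail`.

    `graph` maps an entity to a list of (relation, neighbor) pairs
    (a hypothetical format chosen for this sketch).
    """
    frontier = {head}
    for relation in relation_path:
        next_frontier = set()
        for entity in frontier:
            for rel, neighbor in graph.get(entity, ()):
                if rel == relation:
                    next_frontier.add(neighbor)
        if not next_frontier:
            # No entity in the frontier has an outgoing edge with this relation.
            return False
        frontier = next_frontier
    return tail in frontier
```

The frontier is a set, so entities reachable through several routes are expanded only once per level.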

## Usage

All programs are written in Python 3.6 and use PyTorch 0.4.1 as the deep learning framework. Please install Python and PyTorch before running them.

The data can be downloaded from https://github.com/xwhan/DeepPath and should be placed in the parent directory of this code.

Usage:

`python all_process_no_pretraining_dynamic_attention_lstm_reward_shaping_select_path_three_regularization_methods.py [parameters]`

Available parameters include:

`-a [int]`: The dimension of the graph attention part of state vectors. Default 100.

`-wr [float]`: The reward for an invalid action. Default -0.05.

`-ur [float]`: The reward shaping coefficient for a sequence of actions that does not lead to the ground-truth tail entity. Default 0.01.

`-tr [float]`: Reward of global accuracy for a successful episode. Default 1.0.

`-gw [float]`: Weight for global accuracy. Default 0.1.

`-lw [float]`: Weight for path efficiency. Default 0.8.

`-dw [float]`: Weight for path diversity. Default 0.1.

`-np [int]`: Number of episodes used for training. Default 500.

`-wd [float]`: Base rate for L2 regularization. Default 0.005.

`-d2 [float]`: Base rate for Dropout. Default 0.3.

`-adr [float]`: Base rate for action dropout. Default 0.3.

`-eb [float]`: The base used to compute the difficulty coefficient. Default e.

`-r [str]`: The relation to train on. Its name is the same as the name of this relation's folder in the data folder.

`-t [str]`: Does not need to be set. Keeping it as "retrain" is fine.

`-hue [int]`: Whether to update the hidden state at every step, even when the agent selects an invalid action. Default 0 (no).

`-mo [str]`: The model used to compute the embedding part of the state vectors. Default TransD.

`-remo [str]`: The model used to compute reward shaping. Default ConvE.
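To make the flags and defaults above easier to scan, here is a sketch of how they could be declared with `argparse`. It is not taken from the main program: the `dest` names, the help strings, and the choice to leave `-r` without a default are assumptions, and `some_relation` is a placeholder rather than a real relation name.

```python
import argparse
import math


def build_parser():
    """Parser mirroring the documented flags (dest names are hypothetical)."""
    p = argparse.ArgumentParser(description="AttnPath training and evaluation (sketch)")
    p.add_argument("-a", dest="attention_dim", type=int, default=100,
                   help="dimension of the graph attention part of state vectors")
    p.add_argument("-wr", dest="invalid_reward", type=float, default=-0.05,
                   help="reward for an invalid action")
    p.add_argument("-ur", dest="shaping_coef", type=float, default=0.01,
                   help="reward shaping coefficient for unsuccessful action sequences")
    p.add_argument("-tr", dest="success_reward", type=float, default=1.0,
                   help="global accuracy reward for a successful episode")
    p.add_argument("-gw", dest="global_weight", type=float, default=0.1,
                   help="weight for global accuracy")
    p.add_argument("-lw", dest="efficiency_weight", type=float, default=0.8,
                   help="weight for path efficiency")
    p.add_argument("-dw", dest="diversity_weight", type=float, default=0.1,
                   help="weight for path diversity")
    p.add_argument("-np", dest="num_episodes", type=int, default=500,
                   help="number of training episodes")
    p.add_argument("-wd", dest="l2_base", type=float, default=0.005,
                   help="base rate for L2 regularization")
    p.add_argument("-d2", dest="dropout_base", type=float, default=0.3,
                   help="base rate for dropout")
    p.add_argument("-adr", dest="action_dropout_base", type=float, default=0.3,
                   help="base rate for action dropout")
    p.add_argument("-eb", dest="difficulty_base", type=float, default=math.e,
                   help="base used to compute the difficulty coefficient")
    p.add_argument("-r", dest="relation", type=str, default=None,
                   help="relation to train on (folder name in the data folder)")
    p.add_argument("-t", dest="task", type=str, default="retrain",
                   help="no need to set; keep as 'retrain'")
    p.add_argument("-hue", dest="update_hidden_every_step", type=int, default=0,
                   help="1 to update the hidden state even after an invalid action")
    p.add_argument("-mo", dest="embedding_model", type=str, default="TransD",
                   help="model for the embedding part of state vectors")
    p.add_argument("-remo", dest="reward_model", type=str, default="ConvE",
                   help="model used for reward shaping")
    return p


# Example: override the invalid-action reward and pick a (placeholder) relation.
example_args = build_parser().parse_args(["-r", "some_relation", "-wr", "-0.1"])
```

Any flag that is not passed keeps the default listed above, so a typical run only needs `-r` plus the values being tuned.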
